Using Character-Level Sequence-to-Sequence Model for Word Level Text Generation to Enhance Arabic Speech Recognition
Abstract
Owing to the linguistic richness of the Arabic language, which contains more than 6000 roots, building a reliable language model for speech recognition systems faces many challenges. This paper introduces a language-model-free Arabic automatic speech recognition system for Modern Standard Arabic based on an end-to-end Deep Speech architecture developed by Mozilla. The proposed model uses a character-level sequence-to-sequence model to map the character alignment produced by the recognizer model onto the corresponding words. The proposed model outperformed recent studies on single-speaker and multi-speaker Arabic speech recognition using two different state-of-the-art datasets. The first was the Multi-Genre Broadcast (MGB2) corpus with 1200 h of audio data from multiple speakers. The model achieved a new milestone in the MGB2 challenge with a word error rate (WER) of 3.2, outperforming related work using the same corpus with a word error reduction of 17%. An additional experiment with the 7-hour Saudi Accent Single Speaker Corpus (SASSC) was used to build an additional model for single male speaker-based Arabic speech recognition using the same network architecture. The single-speaker model outperformed related experiments with a WER of 4.25 and a relative improvement of 33.8%.
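The abstract reports its results as word error rate (WER) and relative reduction but does not define either. The following is a minimal sketch, assuming the conventional definitions: WER as the word-level edit distance between a reference transcript and a hypothesis divided by the number of reference words, and relative reduction as the drop in WER divided by the baseline WER. The function names are illustrative and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer (assumed definition)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer
```

Multiplying either result by 100 gives a percentage. Whether the paper's figures were computed in exactly this way is not stated in the abstract.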
Similar resources
Arabic Character Recognition using Approximate Stroke Sequence
Arabic character recognition of handwriting is addressed. A novel approach to Arabic character recognition is presented, based on statistical analysis of a typical Arabic text. Results showed that the sub-word, rather than the word, is the basic pictorial block in the Arabic language. The method of approximate stroke sequence is applied for the recognition of some Arabic characters i...
Character-Level Linguistic Features Extraction for Text-to-Speech System
High-quality linguistic features are the key to the success of speech synthesis. Traditional linguistic feature extraction methods usually rely on a word-level natural language processing (NLP) parser. Since a good parser requires a lot of feature engineering to build, it is usually a general-purpose one and often not specially designed for speech synthesis. To avoid these difficulties, we...
An online sequence-to-sequence model for noisy speech recognition
Generative models have long been the dominant approach for speech recognition. The success of these models, however, relies on the use of sophisticated recipes and complicated machinery that is not easily accessible to non-practitioners. Recent innovations in Deep Learning have given rise to an alternative: discriminative models called Sequence-to-Sequence models, that can almost match the accur...
Morphological Inflection Generation Using Character Sequence to Sequence Learning
Morphological inflection generation is the task of generating the inflected form of a given lemma corresponding to a particular linguistic transformation. We model the problem of inflection generation as a character sequence to sequence learning problem and present a variant of the neural encoder-decoder model for solving it. Our model is language independent and can be trained in both supervis...
Sequence to Sequence Learning for Optical Character Recognition
We propose an end-to-end recurrent encoder-decoder based sequence learning approach for printed text Optical Character Recognition (OCR). In contrast to the existing state-of-the-art OCR solution [Graves et al. (2006)], which uses a CTC output layer, our approach makes minimalistic assumptions on the structure and length of the sequence. We use a two-step encoder-decoder approach: (a) A recur...
Journal
Journal title: IEEE Access
Year: 2023
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2023.3302257